In this file, we present the results of analyses relating exposures to
epigenetic age acceleration (EAA), computed from DNA methylation data by
several methods. For each exposure, we conducted both primary analyses
(likelihood ratio tests and linear regressions, adjusted for
confounders) and sensitivity analyses (likelihood ratio tests and/or
linear regressions, in some cases limited to certain fuel users, not
adjusted for confounders). In addition to the analyses specified in the
analysis plan, we also analyzed ambient and urinary exposures.
Current (self-reported) fuel type
The numbers of observations with each current fuel type:
##
## Smokeles Smoky Wood_and_or_Plant
## 17 87 8
Primary analysis
Investigate the association of EAA with current (self-reported) fuel type in
the LEX study participants, adjusting for known confounders and stove
ventilation. The reference group for this analysis is the smoky
coal users. This is a categorical analysis. The results are a p-value
from the likelihood ratio (LR) test comparing a confounder-only
model with a model that also includes the exposure variables, together
with p-values for the contrast of each other fuel category (smokeless
coal or plant/wood) against smoky coal. FDR correction is applied
separately to each of these sets of p-values. The main interest is in the
coal-specific findings and perhaps less so in the results of the LR
test.
Likelihood ratio (LR) test (mixed model)
Full model: \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * I(\text{Smokeles}) + \beta_2 *
I(\text{Wood_and_or_Plant}) \\
& + \beta_3 * county + \beta_4 * BMI + \beta_5 * ses + \beta_6 *
edu + \beta_7 * curStove + \epsilon
\end{aligned}
\] Nested model: \[
\begin{aligned}
Y = & \beta_0 \\
& + \beta_1 * county + \beta_2 * BMI + \beta_3 * ses + \beta_4 *
edu + \beta_5 * curStove + \epsilon
\end{aligned}
\] \(H_0\): The full model and
the nested model fit the data equally well. Thus, you should use the
nested model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-value results:
## p_vals p_vals_BHadj
## Horvath EAA 0.2368 0.6340
## Hannum EAA 0.6304 0.6340
## PhenoAge EAA 0.5142 0.6340
## Skin&Blood EAA 0.4887 0.6340
## GrimAge EAA 0.0279 0.2232
## DNAmTL 0.5250 0.6340
## IEAA 0.3694 0.6340
## EEAA 0.6340 0.6340
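The p_vals_BHadj column is consistent with a Benjamini-Hochberg (BH) adjustment applied across the eight outcomes. As a minimal sketch, and not the code that produced this report, the following Python function implements the standard BH step-up adjustment and applies it to the raw p-values from the table above. The function name `bh_adjust` is our own.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(p_values)
    # Indices of the p-values sorted from largest to smallest.
    order = sorted(range(n), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = n - position  # rank of this p-value in ascending order
        # Scale by n / rank, then enforce monotonicity from the top down.
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Raw p-values copied from the primary LR test table for current fuel type.
clocks = ["Horvath EAA", "Hannum EAA", "PhenoAge EAA", "Skin&Blood EAA",
          "GrimAge EAA", "DNAmTL", "IEAA", "EEAA"]
p_vals = [0.2368, 0.6304, 0.5142, 0.4887, 0.0279, 0.5250, 0.3694, 0.6340]

p_adj = bh_adjust(p_vals)
for name, p, q in zip(clocks, p_vals, p_adj):
    print(f"{name:<15} {p:.4f} {q:.4f}")
```

Rounded to four decimals, the adjusted values agree with the p_vals_BHadj column above: GrimAge EAA is the only outcome whose adjusted value (0.2232) differs from the largest raw p-value (0.6340).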
Linear regression
In the following section, we performed linear regression with
equation \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * I(\text{Smokeles}) + \beta_2 *
I(\text{Wood_and_or_Plant}) \\
& + \beta_3 * county + \beta_4 * BMI + \beta_5 * ses + \beta_6 *
edu + \beta_7 * curStove + \epsilon
\end{aligned}
\] where \(Y\) is one of the
epigenetic age accelerations and the smoky fuel type is the reference
category.
The estimates of \(\beta_0\),
\(\beta_1\) and \(\beta_2\) for each \(Y\) are shown below. \(\beta_1\) and \(\beta_2\) can be interpreted as “the
expected change in Y when switching from the smoky fuel type to the given
fuel type, while holding other variables constant”.
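To make the interpretation of the indicator coefficients concrete, here is a small sketch in Python. It is an illustration only: the outcome values are made up, the covariates (county, BMI, ses, edu, curStove) are omitted, and the helper names `solve` and `ols` are our own. With smoky as the reference level and no covariates, the fitted intercept equals the mean of the smoky group, and each indicator coefficient equals the difference between that group's mean and the smoky mean.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(X, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Made-up EAA values for illustration; not data from the LEX study.
fuel = ["Smoky", "Smoky", "Smoky", "Smokeles", "Smokeles",
        "Wood_and_or_Plant", "Wood_and_or_Plant"]
y = [1.0, 2.0, 3.0, 0.0, 1.0, 4.0, 6.0]

# Design matrix: intercept, I(Smokeles), I(Wood_and_or_Plant); Smoky is the reference.
X = [[1.0, float(f == "Smokeles"), float(f == "Wood_and_or_Plant")] for f in fuel]
beta = ols(X, y)
print([round(b, 3) for b in beta])
```

In the adjusted models of this report the coefficients are no longer simple differences of raw group means, but they keep the same "relative to smoky" reading.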

Sensitivity analyses
Likelihood ratio (LR) test (single model)
Limit the primary analysis to a single observation from each subject,
so that no mixed model is needed. The rationale is that it is not easy
to obtain unbiased p-values from a mixed model for FDR testing. This
can be remedied during FDR testing, but it is worth checking.
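As a sketch of the restriction to one observation per subject, the Python snippet below keeps a single record for each subject. The source does not say which visit was retained, so keeping the earliest visit is an assumption, and the field names `subject_id`, `visit`, and `eaa` are hypothetical.

```python
def first_visit_per_subject(records):
    """Keep the earliest visit for each subject (assumed selection rule)."""
    kept = {}
    for rec in records:
        sid = rec["subject_id"]
        if sid not in kept or rec["visit"] < kept[sid]["visit"]:
            kept[sid] = rec
    return sorted(kept.values(), key=lambda r: r["subject_id"])


# Made-up records for illustration; not data from the LEX study.
records = [
    {"subject_id": "S1", "visit": 1, "eaa": 0.4},
    {"subject_id": "S1", "visit": 2, "eaa": 0.9},
    {"subject_id": "S2", "visit": 2, "eaa": -1.2},
    {"subject_id": "S3", "visit": 1, "eaa": 0.1},
    {"subject_id": "S3", "visit": 2, "eaa": 0.3},
]

single = first_visit_per_subject(records)
for rec in single:
    print(rec["subject_id"], rec["visit"], rec["eaa"])
```

With one row per subject, the observations can be treated as independent and an ordinary linear model replaces the mixed model.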
Full model: \[Y = \beta_0 + \beta_1 *
I(\text{Smokeles}) + \beta_2 * I(\text{Wood_and_or_Plant}) +
\epsilon\] Nested model: \[Y = \beta_0
+ \epsilon\] \(H_0\): The full
model and the nested model fit the data equally well. Thus, you should
use the nested model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-value results:
## p_vals p_vals_BHadj
## Horvath EAA 0.2800 0.8200
## Hannum EAA 0.4890 0.8819
## PhenoAge EAA 0.8936 0.8936
## Skin&Blood EAA 0.5512 0.8819
## GrimAge EAA 0.1672 0.8200
## DNAmTL 0.8624 0.8936
## IEAA 0.3075 0.8200
## EEAA 0.6635 0.8847
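For readers who want to see how such a test can be computed, the following Python sketch carries out an LR test of the unadjusted fuel-type model against the intercept-only model. It rests on several assumptions: Gaussian errors, the large-sample chi-square reference distribution, and made-up outcome values. The source does not state which software routine produced the p-values above, so this is not a reproduction of them. For nested Gaussian linear models the statistic is n * ln(RSS_nested / RSS_full), and with two extra parameters (df = 2) the chi-square upper tail has the closed form exp(-x / 2).

```python
import math


def rss(values, fitted):
    """Residual sum of squares."""
    return sum((v - f) ** 2 for v, f in zip(values, fitted))


def lr_test_groups(y, groups):
    """LR test of a group-means model against an intercept-only model."""
    n = len(y)
    grand_mean = sum(y) / n
    levels = sorted(set(groups))
    means = {g: sum(v for v, h in zip(y, groups) if h == g) / groups.count(g)
             for g in levels}
    rss_nested = rss(y, [grand_mean] * n)          # Y = b0 + e
    rss_full = rss(y, [means[g] for g in groups])  # Y = b0 + indicators + e
    stat = n * math.log(rss_nested / rss_full)
    df = len(levels) - 1
    if df != 2:
        raise ValueError("the closed-form p-value below is only valid for df = 2")
    p_value = math.exp(-stat / 2)  # chi-square survival function for df = 2
    return stat, df, p_value


# Made-up EAA values for illustration; not data from the LEX study.
fuel = ["Smoky", "Smoky", "Smoky", "Smokeles", "Smokeles",
        "Wood_and_or_Plant", "Wood_and_or_Plant"]
y = [1.0, 2.0, 3.0, 0.0, 1.0, 4.0, 6.0]

stat, df, p_value = lr_test_groups(y, fuel)
print(f"LR statistic = {stat:.3f}, df = {df}, p = {p_value:.4f}")
```

Fitting group means is equivalent to fitting the indicator model with smoky as the reference, so the full-model residual sum of squares is the same under either parameterization.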
Linear relation
Use a trend test to estimate a linear relation across use categories
(1=wood, 2=smokeless coal, 3=smoky coal). Fit the equation: \[Y = \beta_0 + \beta_1 * fuel\_type +
\epsilon\]
## coefficient std pval pval_BHadj
## AgeAccelerationResidual -1.03 0.74 0.17 0.17
## AgeAccelerationResidualHannum -0.70 0.64 0.28 0.37
## AgeAccelPheno -0.06 0.65 0.93 0.93
## DNAmAgeSkinBloodClockAdjAge -0.08 0.53 0.88 0.88
## AgeAccelGrim -0.11 0.47 0.81 0.82
## DNAmTLAdjAge -0.02 0.03 0.60 0.60
## IEAA -0.98 0.67 0.15 0.15
## EEAA -0.72 0.81 0.38 0.47
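The trend test amounts to a simple linear regression of the outcome on the ordinal fuel code. The Python sketch below computes the slope and its standard error for made-up data, using the coding stated above (1=wood, 2=smokeless coal, 3=smoky coal). We assume the "std" column in the table is the standard error of the slope. The p-value would come from a t distribution with n - 2 degrees of freedom, which is left out here because the Python standard library has no t distribution.

```python
import math


def trend_fit(x, y):
    """Slope and standard error from the regression y = b0 + b1 * x + e."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)  # residual variance uses n - 2 df
    return slope, se


# Ordinal coding taken from the text; outcome values are made up.
codes = {"Wood_and_or_Plant": 1, "Smokeles": 2, "Smoky": 3}
fuel = ["Smoky", "Smoky", "Smoky", "Smokeles", "Smokeles",
        "Wood_and_or_Plant", "Wood_and_or_Plant"]
y = [1.0, 2.0, 3.0, 0.0, 1.0, 4.0, 6.0]
x = [codes[f] for f in fuel]

slope, se = trend_fit(x, y)
print(f"slope = {slope:.3f}, se = {se:.3f}, t = {slope / se:.3f}")
```

Treating the codes as equally spaced is itself a modelling choice: the slope is the expected change in Y per one-step move along the wood, smokeless, smoky ordering.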

Cumulative lifetime (self-reported) fuel type
The numbers of observations with each cumulative lifetime fuel
type:
##
## Mix Smoky
## 82 37
Primary analysis
Likelihood ratio (LR) test (mixed model)
Full model: \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * I(\text{Mix}) \\
& + \beta_2 * county + \beta_3 * BMI + \beta_4 * ses + \beta_5 *
edu + \beta_6 * curStove + \epsilon
\end{aligned}
\] Nested model: \[
\begin{aligned}
Y = & \beta_0 \\
& + \beta_1 * county + \beta_2 * BMI + \beta_3 * ses + \beta_4 *
edu + \beta_5 * curStove + \epsilon
\end{aligned}
\] \(H_0\): The full model and
the nested model fit the data equally well. Thus, you should use the
nested model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-value results:
## p_vals p_vals_BHadj
## Horvath EAA 0.3222 0.5415
## Hannum EAA 0.4061 0.5415
## PhenoAge EAA 0.6397 0.7311
## Skin&Blood EAA 0.9331 0.9331
## GrimAge EAA 0.0245 0.1960
## DNAmTL 0.3396 0.5415
## IEAA 0.0940 0.3760
## EEAA 0.2773 0.5415
Linear regression
In the following section, we performed linear regression with
equation \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * I(\text{Mix}) \\
& + \beta_2 * county + \beta_3 * BMI + \beta_4 * ses + \beta_5 *
edu + \beta_6 * curStove + \epsilon
\end{aligned}
\]
where \(Y\) is one of the epigenetic
age accelerations and the smoky fuel type is the reference category.
The estimates of \(\beta_0\) and
\(\beta_1\) for each \(Y\) are shown below. \(\beta_1\) can be interpreted as “the
expected change in Y when switching from the smoky fuel type to the mix
fuel type, while holding other variables constant”.

Sensitivity analyses
Likelihood ratio (LR) test (single model)
Full model: \[Y = \beta_0 + \beta_1 *
I(\text{Mix}) + \epsilon\]
Nested model: \[Y = \beta_0 +
\epsilon\]
\(H_0\): The full model and the
nested model fit the data equally well. Thus, you should use the nested
model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-value results:
## p_vals p_vals_BHadj
## Horvath EAA 0.3532 0.8414
## Hannum EAA 0.7909 0.8414
## PhenoAge EAA 0.8253 0.8414
## Skin&Blood EAA 0.8414 0.8414
## GrimAge EAA 0.1805 0.7220
## DNAmTL 0.6405 0.8414
## IEAA 0.0759 0.6072
## EEAA 0.6484 0.8414
Linear relation
Use a trend test to estimate a linear relation across use categories
(1=mix, 2=smoky coal). Fit the equation: \[Y
= \beta_0 + \beta_1 * fuel\_type + \epsilon\]
## coefficient std pval pval_BHadj
## AgeAccelerationResidual -0.88 0.95 0.36 0.36
## AgeAccelerationResidualHannum 0.21 0.80 0.79 0.79
## AgeAccelPheno 0.18 0.81 0.83 0.83
## DNAmAgeSkinBloodClockAdjAge 0.14 0.69 0.84 0.84
## AgeAccelGrim 0.75 0.57 0.19 0.19
## DNAmTLAdjAge -0.02 0.04 0.64 0.64
## IEAA -1.52 0.86 0.08 0.16
## EEAA 0.46 1.02 0.65 0.65

Childhood (self-reported) fuel type
The numbers of observations with each childhood fuel type:
##
## Mix Smokeles Smoky Wood
## 53 5 47 11
Primary analysis
Likelihood ratio (LR) test (mixed model)
Full model: \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * I(\text{Wood}) + \beta_2 *
I(\text{Smokeles}) + \beta_3 * I(\text{Mix}) \\
& + \beta_4 * county + \beta_5 * BMI + \beta_6 * ses + \beta_7 *
edu + \beta_8 * curStove + \epsilon
\end{aligned}
\] Nested model: \[
\begin{aligned}
Y = & \beta_0 \\
& + \beta_1 * county + \beta_2 * BMI + \beta_3 * ses + \beta_4 *
edu + \beta_5 * curStove + \epsilon
\end{aligned}
\] \(H_0\): The full model and
the nested model fit the data equally well. Thus, you should use the
nested model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-value results:
## p_vals p_vals_BHadj
## Horvath EAA 0.0412 0.1099
## Hannum EAA 0.1426 0.1901
## PhenoAge EAA 0.2872 0.3282
## Skin&Blood EAA 0.1345 0.1901
## GrimAge EAA 0.0051 0.0408
## DNAmTL 0.4625 0.4625
## IEAA 0.0379 0.1099
## EEAA 0.1276 0.1901
Linear regression
In the following section, we performed linear regression with
equation \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * I(\text{Wood}) + \beta_2 *
I(\text{Smokeles}) + \beta_3 * I(\text{Mix}) \\
& + \beta_4 * county + \beta_5 * BMI + \beta_6 * ses + \beta_7 *
edu + \beta_8 * curStove + \epsilon
\end{aligned}
\] where \(Y\) is one of the
epigenetic age accelerations and the smoky fuel type is the reference
category.
The estimates of \(\beta_0\),
\(\beta_1\), \(\beta_2\), and \(\beta_3\) for each \(Y\) are shown below. \(\beta_1\), \(\beta_2\), and \(\beta_3\) can be interpreted as “the
expected change in Y when switching from the smoky fuel type to the given
fuel type, while holding other variables constant”.

Sensitivity analyses
Likelihood ratio (LR) test (single model)
Limit the primary analysis to a single observation from each subject,
so that no mixed model is needed. The rationale is that it is not easy
to obtain unbiased p-values from a mixed model for FDR testing. This
can be remedied during FDR testing, but it is worth checking.
Full model: \[Y = \beta_0 + \beta_1 *
I(\text{Wood}) + \beta_2 * I(\text{Smokeles}) + \beta_3 * I(\text{Mix})
+ \epsilon\] Nested model: \[Y =
\beta_0 + \epsilon\] \(H_0\):
The full model and the nested model fit the data equally well. Thus, you
should use the nested model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-value results:
## p_vals p_vals_BHadj
## Horvath EAA 0.2833 0.6101
## Hannum EAA 0.3813 0.6101
## PhenoAge EAA 0.8336 0.8336
## Skin&Blood EAA 0.7398 0.8336
## GrimAge EAA 0.0146 0.1168
## DNAmTL 0.5919 0.7892
## IEAA 0.1220 0.4880
## EEAA 0.3340 0.6101
Linear relation
Use a trend test to estimate a linear relation across use categories
(1=wood, 2=smokeless coal, 3=mix, 4=smoky coal). Fit the
equation: \[Y = \beta_0 + \beta_1 *
fuel\_type + \epsilon\]
## coefficient std pval pval_BHadj
## AgeAccelerationResidual -0.70 0.50 0.16 0.16
## AgeAccelerationResidualHannum -0.50 0.42 0.24 0.30
## AgeAccelPheno -0.13 0.43 0.77 0.93
## DNAmAgeSkinBloodClockAdjAge 0.02 0.36 0.95 0.95
## AgeAccelGrim 0.27 0.30 0.37 0.37
## DNAmTLAdjAge 0.01 0.02 0.62 0.99
## IEAA -0.87 0.44 0.05 0.06
## EEAA -0.52 0.53 0.33 0.39

Clusters based on model-based exposure estimates at or shortly
before the visit (clusCUR6)
The file “LEX_clusCUR6.csv” has information on current pollutant
exposures, obtained for the 2 years preceding the visit. To reduce
multicollinearity between exposures, exposure prototypes were derived
by hierarchical cluster analysis followed by principal components
analysis. Estimates are available for 6
different prototypes (cluster variables) for a total of 161 subjects and
211 visits. The prototypes are labelled as:
CUR6_BC_PAH6 – Black carbon (BC) and 6 PAHs
CUR6_PAH31 – a large cluster of 31 PAHs
CUR6_NkF – NkF only
CUR6_PM_RET – Particulate matter (PM) and retene
CUR6_NO2 – NO2 only
CUR6_SO2 – SO2 only
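The text describes the prototypes only as the result of hierarchical cluster analysis followed by principal components analysis. As a rough sketch of the second step, the Python code below computes first-principal-component scores for the members of one cluster. It assumes that a prototype is the first PC score of the standardized pollutants in that cluster, which the source does not confirm. The clustering step is omitted, and the three pollutant series are made up.

```python
import math


def standardize(column):
    """Center to mean 0 and scale to sample standard deviation 1."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    return [(v - mean) / sd for v in column]


def first_pc_scores(columns, iterations=200):
    """Loadings and scores of the first principal component."""
    z = [standardize(c) for c in columns]
    n, k = len(z[0]), len(z)
    # Correlation matrix of the cluster members.
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1) for j in range(k)]
            for i in range(k)]
    # Power iteration converges to the leading eigenvector.
    vec = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        nxt = [sum(corr[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    scores = [sum(vec[j] * z[j][row] for j in range(k)) for row in range(n)]
    return vec, scores


# Made-up exposure series for three pollutants assumed to share one cluster.
pollutant_a = [1.0, 2.0, 3.0, 4.0, 5.0]
pollutant_b = [2.1, 3.9, 6.2, 8.1, 9.8]
pollutant_c = [0.5, 1.7, 2.4, 4.2, 4.9]

loadings, scores = first_pc_scores([pollutant_a, pollutant_b, pollutant_c])
print([round(v, 3) for v in loadings])
print([round(s, 3) for s in scores])
```

Replacing a cluster of highly correlated pollutants with one score per cluster is what reduces multicollinearity when the prototypes enter a regression together.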
Summary of the exposure estimates:
| Characteristic | Overall, N = 112 | Smokeles, N = 17 | Smoky, N = 87 | Wood_and_or_Plant, N = 8 |
| CUR6_BC_PAH6 | 0.79 (-0.5, 0.8) | -1.32 (-1.4, -0.9) | 0.80 (-0.2, 1.1) | 0.69 (0.1, 0.7) |
| (Missing) | 3 | 2 | 1 | 0 |
| CUR6_PAH31 | 0.38 (-0.4, 0.6) | -1.14 (-1.4, -0.5) | 0.46 (-0.1, 0.6) | 0.75 (0.4, 0.8) |
| (Missing) | 3 | 2 | 1 | 0 |
| CUR6_NkF | -0.40 (-0.6, 0.7) | 0.06 (-0.2, 0.3) | -0.51 (-0.6, 0.9) | 0.74 (-0.2, 0.7) |
| (Missing) | 3 | 2 | 1 | 0 |
| CUR6_PM_RET | -0.32 (-0.5, 0.4) | -0.04 (-0.9, 0.3) | -0.32 (-0.5, 0.1) | 2.49 (0.9, 2.6) |
| (Missing) | 3 | 2 | 1 | 0 |
| CUR6_NO2 | 0.06 (-0.4, 0.8) | 1.00 (0.6, 1.4) | -0.06 (-0.5, 0.5) | 0.63 (-0.2, 1.3) |
| (Missing) | 3 | 2 | 1 | 0 |
| CUR6_SO2 | -0.30 (-0.9, 0.3) | 1.37 (0.2, 1.5) | -0.30 (-0.9, 0.1) | -1.00 (-1.3, -0.9) |
| (Missing) | 3 | 2 | 1 | 0 |
Primary analysis
Likelihood ratio (LR) test (mixed model)
Full model: \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * \text{BC_PAH6} + \beta_2 * \text{PAH31}
+ \beta_3 * \text{NkF} + \beta_4 * \text{PM_RET} + \beta_5 * \text{NO2}
+ \beta_6 * \text{SO2}\\
& + \beta_7 * county + \beta_8 * BMI + \beta_9 * ses + \beta_{10}
* edu + \beta_{11} * curStove + \epsilon
\end{aligned}
\] Nested model: \[
\begin{aligned}
Y = & \beta_0 \\
& + \beta_1 * county + \beta_2 * BMI + \beta_3 * ses + \beta_4 *
edu + \beta_5 * curStove + \epsilon
\end{aligned}
\] \(H_0\): The full model and
the nested model fit the data equally well. Thus, you should use the
nested model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-value results:
## p_vals p_vals_BHadj
## Horvath EAA 0.1900 0.2533
## Hannum EAA 0.0239 0.0504
## PhenoAge EAA 0.0210 0.0504
## Skin&Blood EAA 0.1401 0.2242
## GrimAge EAA 0.0085 0.0504
## DNAmTL 0.2939 0.3359
## IEAA 0.4320 0.4320
## EEAA 0.0252 0.0504
Linear regression
In the following section, we performed linear regression with
equation \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * \text{BC_PAH6} + \beta_2 * \text{PAH31}
+ \beta_3 * \text{NkF} + \beta_4 * \text{PM_RET} + \beta_5 * \text{NO2}
+ \beta_6 * \text{SO2}\\
& + \beta_7 * county + \beta_8 * BMI + \beta_9 * ses + \beta_{10}
* edu + \beta_{11} * curStove + \epsilon
\end{aligned}
\] where \(Y\) is one of the
epigenetic age accelerations.
The estimates of \(\beta_1\),
\(\beta_2\), \(\beta_3\), \(\beta_4\), \(\beta_5\), and \(\beta_6\) for each \(Y\) are shown below. Each of these coefficients can be interpreted as “the
expected change in Y for a one-unit increase in the given exposure prototype,
while holding other variables constant”.

Sensitivity analyses
Likelihood ratio (LR) test (single model)
Full model: \[Y = \beta_0 + \beta_1 *
\text{BC_PAH6} + \beta_2 * \text{PAH31} + \beta_3 * \text{NkF} + \beta_4
* \text{PM_RET} + \beta_5 * \text{NO2} + \beta_6 * \text{SO2} +
\epsilon\]
Nested model: \[Y = \beta_0 +
\epsilon\]
\(H_0\): The full model and the
nested model fit the data equally well. Thus, you should use the nested
model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-value results:
## p_vals p_vals_BHadj
## Horvath EAA 0.1840 0.2968
## Hannum EAA 0.2914 0.3775
## PhenoAge EAA 0.0241 0.0755
## Skin&Blood EAA 0.0283 0.0755
## GrimAge EAA 0.0263 0.0755
## DNAmTL 0.4823 0.4823
## IEAA 0.3303 0.3775
## EEAA 0.1855 0.2968
Likelihood ratio (LR) test (single model) with subjects using only
smoky or smokeless coal
Full model: \[Y = \beta_0 + \beta_1 *
\text{BC_PAH6} + \beta_2 * \text{PAH31} + \beta_3 * \text{NkF} + \beta_4
* \text{PM_RET} + \beta_5 * \text{NO2} + \beta_6 * \text{SO2} +
\epsilon\]
Nested model: \[Y = \beta_0 +
\epsilon\]
\(H_0\): The full model and the
nested model fit the data equally well. Thus, you should use the nested
model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-value results:
## p_vals p_vals_BHadj
## Horvath EAA 0.4226 0.4830
## Hannum EAA 0.1558 0.2707
## PhenoAge EAA 0.0209 0.1672
## Skin&Blood EAA 0.1692 0.2707
## GrimAge EAA 0.0806 0.2149
## DNAmTL 0.2510 0.3347
## IEAA 0.6041 0.6041
## EEAA 0.0626 0.2149
Likelihood ratio (LR) test (single model) with subjects only using
smoky coal
Full model: \[Y = \beta_0 + \beta_1 *
\text{BC_PAH6} + \beta_2 * \text{PAH31} + \beta_3 * \text{NkF} + \beta_4
* \text{PM_RET} + \beta_5 * \text{NO2} + \beta_6 * \text{SO2} +
\epsilon\]
Nested model: \[Y = \beta_0 +
\epsilon\]
\(H_0\): The full model and the
nested model fit the data equally well. Thus, you should use the nested
model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-value results:
## p_vals p_vals_BHadj
## Horvath EAA 0.0700 0.0800
## Hannum EAA 0.0037 0.0148
## PhenoAge EAA 0.0093 0.0248
## Skin&Blood EAA 0.0426 0.0800
## GrimAge EAA 0.0651 0.0800
## DNAmTL 0.2166 0.2166
## IEAA 0.0509 0.0800
## EEAA 0.0019 0.0148
Clusters based on model-based exposure estimates accrued before age
18 (clusCHLD5)
The file “LEX_clusCHLD5.csv” has information on estimated pollutant
exposures during early childhood. Estimates are available for 5
different prototypes (cluster variables) for a total of 161 subjects and
211 visits. The prototypes are labelled as:
CHLD5_X7 – a cluster of 7 air pollutants
CHLD5_X33 – a large cluster of 33 air pollutants
CHLD5_NkF – NkF only
CHLD5_NO2 – NO2 only
CHLD5_SO2 – SO2 only
Summary of the exposure estimates:
| Characteristic | Overall, N = 112 | Smokeles, N = 17 | Smoky, N = 87 | Wood_and_or_Plant, N = 8 |
| CHLD5_X7 | 0.09 (-0.5, 0.5) | -0.63 (-0.9, -0.1) | 0.10 (-0.5, 0.3) | 0.86 (0.7, 1.1) |
| (Missing) | 3 | 2 | 1 | 0 |
| CHLD5_X33 | 0.23 (-0.7, 1.1) | -0.83 (-1.4, 0.1) | 0.51 (-0.4, 1.2) | 0.95 (-0.1, 1.0) |
| (Missing) | 3 | 2 | 1 | 0 |
| CHLD5_NkF | -0.21 (-0.8, 0.7) | 0.06 (-0.3, 0.7) | -0.45 (-1.0, 0.5) | 1.07 (0.5, 1.5) |
| (Missing) | 3 | 2 | 1 | 0 |
| CHLD5_NO2 | 0.34 (-0.5, 0.8) | 0.17 (-0.5, 0.9) | 0.43 (-0.6, 0.8) | -0.21 (-0.3, 0.2) |
| (Missing) | 3 | 2 | 1 | 0 |
| CHLD5_SO2 | 0.34 (-0.7, 0.4) | 0.45 (0.3, 1.4) | 0.34 (-0.9, 0.4) | 0.22 (-0.2, 0.3) |
| (Missing) | 3 | 2 | 1 | 0 |
Primary analysis
Likelihood ratio (LR) test (mixed model)
Full model: \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * \text{X7} + \beta_2 * \text{X33} +
\beta_3 * \text{NkF} + \beta_4 * \text{NO2} + \beta_5 * \text{SO2}\\
& + \beta_6 * county + \beta_7 * BMI + \beta_8 * ses + \beta_{9} *
edu + \beta_{10} * curStove + \epsilon
\end{aligned}
\] Nested model: \[
\begin{aligned}
Y = & \beta_0 \\
& + \beta_1 * county + \beta_2 * BMI + \beta_3 * ses + \beta_4 *
edu + \beta_5 * curStove + \epsilon
\end{aligned}
\] \(H_0\): The full model and
the nested model fit the data equally well. Thus, you should use the
nested model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-value results:
## p_vals p_vals_BHadj
## Horvath EAA 0.4899 0.5599
## Hannum EAA 0.1305 0.2088
## PhenoAge EAA 0.0576 0.1782
## Skin&Blood EAA 0.0716 0.1782
## GrimAge EAA 0.0120 0.0960
## DNAmTL 0.5692 0.5692
## IEAA 0.4260 0.5599
## EEAA 0.0891 0.1782
Linear regression
In the following section, we performed linear regression with
equation \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * \text{X7} + \beta_2 * \text{X33} +
\beta_3 * \text{NkF} + \beta_4 * \text{NO2} + \beta_5 * \text{SO2}\\
& + \beta_6 * county + \beta_7 * BMI + \beta_8 * ses + \beta_{9} *
edu + \beta_{10} * curStove + \epsilon
\end{aligned}
\] where \(Y\) is one of the
epigenetic age accelerations.
The estimates of \(\beta_1\),
\(\beta_2\), \(\beta_3\), \(\beta_4\), and \(\beta_5\) for each \(Y\) are shown below. Each of these coefficients can be interpreted as “the
expected change in Y for a one-unit increase in the given exposure prototype,
while holding other variables constant”.

Sensitivity analyses
Likelihood ratio (LR) test (single model)
Full model: \[Y = \beta_0 + \beta_1 *
\text{X7} + \beta_2 * \text{X33} + \beta_3 * \text{NkF} + \beta_4 *
\text{NO2} + \beta_5 * \text{SO2} + \epsilon\]
Nested model: \[Y = \beta_0 +
\epsilon\]
\(H_0\): The full model and the
nested model fit the data equally well. Thus, you should use the nested
model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-value results:
## p_vals p_vals_BHadj
## Horvath EAA 0.8864 0.8864
## Hannum EAA 0.2901 0.5840
## PhenoAge EAA 0.1416 0.5664
## Skin&Blood EAA 0.3650 0.5840
## GrimAge EAA 0.0208 0.1664
## DNAmTL 0.5466 0.7288
## IEAA 0.6847 0.7825
## EEAA 0.3074 0.5840
Likelihood ratio (LR) test (single model) with subjects using only
smoky or smokeless coal
Full model: \[Y = \beta_0 + \beta_1 *
\text{X7} + \beta_2 * \text{X33} + \beta_3 * \text{NkF} + \beta_4 *
\text{NO2} + \beta_5 * \text{SO2} + \epsilon\]
Nested model: \[Y = \beta_0 +
\epsilon\]
\(H_0\): The full model and the
nested model fit the data equally well. Thus, you should use the nested
model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-value results:
## p_vals p_vals_BHadj
## Horvath EAA 0.9805 0.9805
## Hannum EAA 0.3840 0.7341
## PhenoAge EAA 0.0700 0.2800
## Skin&Blood EAA 0.1867 0.4979
## GrimAge EAA 0.0634 0.2800
## DNAmTL 0.5506 0.7341
## IEAA 0.7515 0.8589
## EEAA 0.4588 0.7341
Likelihood ratio (LR) test (single model) with subjects only using
smoky coal
Full model: \[Y = \beta_0 + \beta_1 *
\text{X7} + \beta_2 * \text{X33} + \beta_3 * \text{NkF} + \beta_4 *
\text{NO2} + \beta_5 * \text{SO2} + \epsilon\]
Nested model: \[Y = \beta_0 +
\epsilon\]
\(H_0\): The full model and the
nested model fit the data equally well. Thus, you should use the nested
model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-value results:
## p_vals p_vals_BHadj
## Horvath EAA 0.9268 0.9268
## Hannum EAA 0.3979 0.6366
## PhenoAge EAA 0.0655 0.3264
## Skin&Blood EAA 0.2424 0.6366
## GrimAge EAA 0.0816 0.3264
## DNAmTL 0.3621 0.6366
## IEAA 0.7955 0.9091
## EEAA 0.4873 0.6497
Clusters based on model-based lifetime exposure estimates
(clusCUM6)
The file “LEX_clusCUM6.csv” has information on estimated cumulative
pollutant exposures over the life course. Estimates are available for 6
different prototypes (cluster variables) for a total of 161 subjects and
211 visits. The prototypes are labelled as:
CUM6_BC_NO2_PM – a cluster of BC, NO2, and PM
CUM6_PAH36 – a large cluster of 36 PAHs
CUM6_DlP – DlP only
CUM6_NkF – NkF only
CUM6_RET – retene only
CUM6_SO2 – SO2 only
Summary of the exposure estimates:
| Characteristic | Overall, N = 112 | Smokeles, N = 17 | Smoky, N = 87 | Wood_and_or_Plant, N = 8 |
| CUM6_BC_NO2_PM | 0.22 (-0.6, 0.8) | 0.19 (-0.3, 0.7) | 0.10 (-1.0, 0.8) | 1.38 (0.4, 1.6) |
| (Missing) | 3 | 2 | 1 | 0 |
| CUM6_PAH36 | 0.25 (-0.6, 1.1) | -1.00 (-1.2, -0.3) | 0.32 (-0.5, 1.2) | 0.83 (0.4, 1.4) |
| (Missing) | 3 | 2 | 1 | 0 |
| CUM6_DlP | -0.48 (-1.0, 0.8) | 0.65 (0.5, 1.1) | -0.66 (-1.2, 0.7) | 0.42 (0.3, 0.6) |
| (Missing) | 3 | 2 | 1 | 0 |
| CUM6_NkF | -0.22 (-0.8, 0.5) | -0.07 (-0.3, 0.4) | -0.31 (-1.0, 0.4) | 1.18 (0.1, 1.7) |
| (Missing) | 3 | 2 | 1 | 0 |
| CUM6_RET | -0.22 (-0.7, 0.3) | -0.41 (-0.9, 0.3) | -0.25 (-0.8, 0.2) | 1.71 (1.2, 1.9) |
| (Missing) | 3 | 2 | 1 | 0 |
| CUM6_SO2 | 0.09 (-0.4, 0.4) | 1.13 (0.5, 1.6) | -0.03 (-0.9, 0.3) | -0.02 (-0.6, 0.1) |
| (Missing) | 3 | 2 | 1 | 0 |
Primary analysis
Likelihood ratio (LR) test (mixed model)
Full model: \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * \text{BC_NO2_PM} + \beta_2 *
\text{PAH36} + \beta_3 * \text{DlP} + \beta_4 * \text{NkF} + \beta_5 *
\text{RET} + \beta_6 * \text{SO2}\\
& + \beta_7 * county + \beta_8 * BMI + \beta_9 * ses + \beta_{10}
* edu + \beta_{11} * curStove + \epsilon
\end{aligned}
\] Nested model: \[
\begin{aligned}
Y = & \beta_0 \\
& + \beta_1 * county + \beta_2 * BMI + \beta_3 * ses + \beta_4 *
edu + \beta_5 * curStove + \epsilon
\end{aligned}
\] \(H_0\): The full model and
the nested model fit the data equally well. Thus, you should use the
nested model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-value results:
## p_vals p_vals_BHadj
## Horvath EAA 0.4844 0.4844
## Hannum EAA 0.3926 0.4844
## PhenoAge EAA 0.0933 0.3732
## Skin&Blood EAA 0.2405 0.3885
## GrimAge EAA 0.0011 0.0088
## DNAmTL 0.2155 0.3885
## IEAA 0.4703 0.4844
## EEAA 0.2428 0.3885
Linear regression
In the following section, we performed linear regression with
equation \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * \text{BC_NO2_PM} + \beta_2 *
\text{PAH36} + \beta_3 * \text{DlP} + \beta_4 * \text{NkF} + \beta_5 *
\text{RET} + \beta_6 * \text{SO2}\\
& + \beta_7 * county + \beta_8 * BMI + \beta_9 * ses + \beta_{10}
* edu + \beta_{11} * curStove + \epsilon
\end{aligned}
\] where \(Y\) is one of the
epigenetic age accelerations.
The estimates of \(\beta_1\),
\(\beta_2\), \(\beta_3\), \(\beta_4\), \(\beta_5\), and \(\beta_6\) for each \(Y\) are shown below. Each of these coefficients can be interpreted as “the
expected change in Y for a one-unit increase in the given exposure prototype,
while holding other variables constant”.

Sensitivity analyses
Likelihood ratio (LR) test (single model)
Full model: \[Y = \beta_0 + \beta_1 *
\text{BC_NO2_PM} + \beta_2 * \text{PAH36} + \beta_3 * \text{DlP} +
\beta_4 * \text{NkF} + \beta_5 * \text{RET} + \beta_6 * \text{SO2} +
\epsilon\]
Nested model: \[Y = \beta_0 +
\epsilon\]
\(H_0\): The full model and the
nested model fit the data equally well. Thus, you should use the nested
model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-value results:
## p_vals p_vals_BHadj
## Horvath EAA 0.5551 0.7401
## Hannum EAA 0.8488 0.8488
## PhenoAge EAA 0.1559 0.4157
## Skin&Blood EAA 0.2862 0.5202
## GrimAge EAA 0.0170 0.1360
## DNAmTL 0.1043 0.4157
## IEAA 0.3251 0.5202
## EEAA 0.7581 0.8488
Likelihood ratio (LR) test (single model) with subjects using only
smoky or smokeless coal
Full model: \[Y = \beta_0 + \beta_1 *
\text{BC_NO2_PM} + \beta_2 * \text{PAH36} + \beta_3 * \text{DlP} +
\beta_4 * \text{NkF} + \beta_5 * \text{RET} + \beta_6 * \text{SO2} +
\epsilon\]
Nested model: \[Y = \beta_0 +
\epsilon\]
\(H_0\): The full model and the
nested model fit the data equally well. Thus, you should use the nested
model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-value results:
## p_vals p_vals_BHadj
## Horvath EAA 0.6381 0.6381
## Hannum EAA 0.5536 0.6342
## PhenoAge EAA 0.0248 0.1984
## Skin&Blood EAA 0.1313 0.2626
## GrimAge EAA 0.1039 0.2626
## DNAmTL 0.0790 0.2626
## IEAA 0.4141 0.6342
## EEAA 0.5549 0.6342
Likelihood ratio (LR) test (single model) with subjects only using
smoky coal
Full model: \[Y = \beta_0 + \beta_1 *
\text{BC_NO2_PM} + \beta_2 * \text{PAH36} + \beta_3 * \text{DlP} +
\beta_4 * \text{NkF} + \beta_5 * \text{RET} + \beta_6 * \text{SO2} +
\epsilon\]
Nested model: \[Y = \beta_0 +
\epsilon\]
\(H_0\): The full model and the
nested model fit the data equally well. Thus, you should use the nested
model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-value results:
## p_vals p_vals_BHadj
## Horvath EAA 0.6380 0.6380
## Hannum EAA 0.3701 0.4914
## PhenoAge EAA 0.0243 0.1944
## Skin&Blood EAA 0.0878 0.2054
## GrimAge EAA 0.1027 0.2054
## DNAmTL 0.0826 0.2054
## IEAA 0.4300 0.4914
## EEAA 0.3562 0.4914
Clusters based on pollutant measurements (clusMEAS6)
The file “LEX_clusMEAS6.csv” has information on measured pollutant
exposures during each visit. Estimates are available for 6 different
prototypes (cluster variables) for a total of 54 subjects and 54 visits.
The prototypes are labelled as:
MEAS6_BC_PM_RET – a cluster of BC, PM, and retene
MEAS6_X31 – a large cluster of 31 air pollutants
MEAS6_X5 – a smaller cluster of 5 air pollutants
MEAS6_DlP – DlP only
MEAS6_NkF – NkF only
MEAS6_NO2_SO2 – NO2 and SO2
Summary of the exposure estimates:
| Characteristic | Overall, N = 112 | Smokeles, N = 17 | Smoky, N = 87 | Wood_and_or_Plant, N = 8 |
| MEAS6_BC_PM_RET | 0.05 (-0.6, 0.5) | -0.40 (-1.6, -0.3) | 0.07 (-0.5, 0.5) | 1.08 (0.5, 2.1) |
| (Missing) | 70 | 10 | 57 | 3 |
| MEAS6_X31 | 0.19 (-0.6, 0.7) | -1.02 (-1.8, -0.8) | 0.31 (-0.1, 0.8) | 0.35 (-0.5, 0.8) |
| (Missing) | 70 | 10 | 57 | 3 |
| MEAS6_X5 | -0.14 (-1.0, 1.0) | -1.07 (-1.1, -1.0) | 0.46 (-0.8, 1.1) | 0.55 (-0.1, 0.9) |
| (Missing) | 70 | 10 | 57 | 3 |
| MEAS6_DlP | -0.63 (-0.7, 1.3) | 0.35 (-0.6, 1.0) | -0.69 (-0.7, 1.2) | -0.30 (-0.5, 1.3) |
| (Missing) | 70 | 10 | 57 | 3 |
| MEAS6_NkF | -0.50 (-0.6, 1.2) | -0.39 (-0.6, 0.6) | -0.50 (-0.6, 1.2) | -0.50 (-0.7, 0.2) |
| (Missing) | 70 | 10 | 57 | 3 |
| MEAS6_NO2_SO2 | -0.08 (-0.9, 0.8) | 0.98 (0.5, 1.5) | -0.37 (-0.9, 0.8) | -0.37 (-1.3, 0.2) |
| (Missing) | 70 | 10 | 57 | 3 |
Primary analysis
Likelihood ratio (LR) test (mixed model)
Full model: \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * \text{BC_PM_RET} + \beta_2 * \text{X31}
+ \beta_3 * \text{X5} + \beta_4 * \text{DlP} + \beta_5 * \text{NkF} +
\beta_6 * \text{NO2_SO2}\\
& + \beta_7 * county + \beta_8 * BMI + \beta_9 * ses + \beta_{10}
* edu + \beta_{11} * curStove + \epsilon
\end{aligned}
\] Nested model: \[
\begin{aligned}
Y = & \beta_0 \\
& + \beta_1 * county + \beta_2 * BMI + \beta_3 * ses + \beta_4 *
edu + \beta_5 * curStove + \epsilon
\end{aligned}
\] \(H_0\): The full model and
the nested model fit the data equally well. Thus, you should use the
nested model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-value results:
## p_vals p_vals_BHadj
## Horvath EAA 0.0198 0.0568
## Hannum EAA 0.0494 0.0859
## PhenoAge EAA 0.0958 0.1277
## Skin&Blood EAA 0.0027 0.0216
## GrimAge EAA 0.5043 0.5043
## DNAmTL 0.1552 0.1774
## IEAA 0.0537 0.0859
## EEAA 0.0213 0.0568
Linear regression
In the following section, we performed linear regression with
equation \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * \text{BC_PM_RET} + \beta_2 * \text{X31}
+ \beta_3 * \text{X5} + \beta_4 * \text{DlP} + \beta_5 * \text{NkF} +
\beta_6 * \text{NO2_SO2}\\
& + \beta_7 * county + \beta_8 * BMI + \beta_9 * ses + \beta_{10}
* edu + \beta_{11} * curStove + \epsilon
\end{aligned}
\] where \(Y\) is one of the
epigenetic age accelerations.
The estimates of \(\beta_1\),
\(\beta_2\), \(\beta_3\), \(\beta_4\), \(\beta_5\), and \(\beta_6\) for each \(Y\) are shown below. Each of these coefficients can be interpreted as “the
expected change in Y for a one-unit increase in the given exposure prototype,
while holding other variables constant”.

Sensitivity analyses
Likelihood ratio (LR) test (single model)
Full model: \[Y = \beta_0 + \beta_1
* \text{BC_PM_RET} + \beta_2 * \text{X31} + \beta_3 * \text{X5} +
\beta_4 * \text{DlP} + \beta_5 * \text{NkF} + \beta_6 * \text{NO2_SO2} +
\epsilon\]
Nested model: \[Y = \beta_0 +
\epsilon\]
\(H_0\): The full model and the
nested model fit the data equally well. Thus, you should use the nested
model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-value results:
## p_vals p_vals_BHadj
## Horvath EAA 0.1034 0.1654
## Hannum EAA 0.0550 0.1306
## PhenoAge EAA 0.0653 0.1306
## Skin&Blood EAA 0.0353 0.1306
## GrimAge EAA 0.1928 0.2571
## DNAmTL 0.2487 0.2842
## IEAA 0.5142 0.5142
## EEAA 0.0263 0.1306
Likelihood ratio (LR) test (single model) with subjects using only
smoky or smokeless coal
Full model: \[Y = \beta_0 + \beta_1
* \text{BC_PM_RET} + \beta_2 * \text{X31} + \beta_3 * \text{X5} +
\beta_4 * \text{DlP} + \beta_5 * \text{NkF} + \beta_6 * \text{NO2_SO2} +
\epsilon\]
Nested model: \[Y = \beta_0 +
\epsilon\]
\(H_0\): The full model and the
nested model fit the data equally well. Thus, you should use the nested
model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.0715 0.1834
## Hannum EAA 0.1314 0.2102
## PhenoAge EAA 0.2403 0.3204
## Skin&Blood EAA 0.0446 0.1834
## GrimAge EAA 0.0917 0.1834
## DNAmTL 0.4322 0.4322
## IEAA 0.3579 0.4090
## EEAA 0.0624 0.1834
Likelihood ratio (LR) test (single model) with subjects only using
smoky coal
Full model: \[Y = \beta_0 + \beta_1
* \text{BC_PM_RET} + \beta_2 * \text{X31} + \beta_3 * \text{X5} +
\beta_4 * \text{DlP} + \beta_5 * \text{NkF} + \beta_6 * \text{NO2_SO2} +
\epsilon\]
Nested model: \[Y = \beta_0 +
\epsilon\]
\(H_0\): The full model and the
nested model fit the data equally well. Thus, you should use the nested
model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.1992 0.3187
## Hannum EAA 0.1754 0.3187
## PhenoAge EAA 0.2713 0.3617
## Skin&Blood EAA 0.0873 0.2472
## GrimAge EAA 0.0319 0.2472
## DNAmTL 0.4242 0.4848
## IEAA 0.6646 0.6646
## EEAA 0.0927 0.2472
Clusters based on urinary biomarkers (clusURI5)
The file “LEX_clusURI5.csv” has information on measured urinary
biomarkers obtained during each visit. Estimates are available for 5
different prototypes (cluster variables) for a total of 163 subjects and
186 visits. The prototypes are labelled as:
URI5_NAP_1M_2M – a cluster of Naphthalene, 1Methylnaphthalene, and
2Methylnaphthalene
URI5_ACE – Acenaphthene only
URI5_FLU_PHE – Fluoranthene and Phenanthrene_anth
URI5_PYR – Pyrene only
URI5_CHR – Baa_Chrysene only
Summary of the exposure estimates:
| Characteristic | Overall, N = 112 | Smokeles, N = 17 | Smoky, N = 87 | Wood_and_or_Plant, N = 8 |
| --- | --- | --- | --- | --- |
| MEAS6_BC_PM_RET | 0.05 (-0.6, 0.5) | -0.40 (-1.6, -0.3) | 0.07 (-0.5, 0.5) | 1.08 (0.5, 2.1) |
| (Missing) | 70 | 10 | 57 | 3 |
| MEAS6_X31 | 0.19 (-0.6, 0.7) | -1.02 (-1.8, -0.8) | 0.31 (-0.1, 0.8) | 0.35 (-0.5, 0.8) |
| (Missing) | 70 | 10 | 57 | 3 |
| MEAS6_X5 | -0.14 (-1.0, 1.0) | -1.07 (-1.1, -1.0) | 0.46 (-0.8, 1.1) | 0.55 (-0.1, 0.9) |
| (Missing) | 70 | 10 | 57 | 3 |
| MEAS6_DlP | -0.63 (-0.7, 1.3) | 0.35 (-0.6, 1.0) | -0.69 (-0.7, 1.2) | -0.30 (-0.5, 1.3) |
| (Missing) | 70 | 10 | 57 | 3 |
| MEAS6_NkF | -0.50 (-0.6, 1.2) | -0.39 (-0.6, 0.6) | -0.50 (-0.6, 1.2) | -0.50 (-0.7, 0.2) |
| (Missing) | 70 | 10 | 57 | 3 |
| MEAS6_NO2_SO2 | -0.08 (-0.9, 0.8) | 0.98 (0.5, 1.5) | -0.37 (-0.9, 0.8) | -0.37 (-1.3, 0.2) |
| (Missing) | 70 | 10 | 57 | 3 |
Primary analysis
Likelihood ratio (LR) test (mix model)
Full model: \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * \text{NAP_1M_2M} + \beta_2 * \text{ACE}
+ \beta_3 * \text{FLU_PHE} + \beta_4 * \text{PYR} + \beta_5 *
\text{CHR}\\
& + \beta_6 * county + \beta_7 * BMI + \beta_8 * ses + \beta_{9} *
edu + \beta_{10} * curStove + \epsilon
\end{aligned}
\] Nested model: \[
\begin{aligned}
Y = & \beta_0 \\
& + \beta_1 * county + \beta_2 * BMI + \beta_3 * ses + \beta_4 *
edu + \beta_5 * curStove + \epsilon
\end{aligned}
\] \(H_0\): The full model and
the nested model fit the data equally well. Thus, you should use the
nested model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.4408 0.6169
## Hannum EAA 0.4716 0.6169
## PhenoAge EAA 0.0226 0.1808
## Skin&Blood EAA 0.8595 0.8595
## GrimAge EAA 0.0945 0.2520
## DNAmTL 0.0815 0.2520
## IEAA 0.5398 0.6169
## EEAA 0.5041 0.6169
Linear regression
In the following section, we performed linear regression with
equation \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * \text{NAP_1M_2M} + \beta_2 * \text{ACE}
+ \beta_3 * \text{FLU_PHE} + \beta_4 * \text{PYR} + \beta_5 *
\text{CHR}\\
& + \beta_6 * county + \beta_7 * BMI + \beta_8 * ses + \beta_{9} *
edu + \beta_{10} * curStove + \epsilon
\end{aligned}
\] where \(Y\) is one of the
epigenetic age accelerations.
The estimates of \(\beta_1\) through \(\beta_5\) for each \(Y\) are shown below. Each coefficient can be interpreted as “the
expected change in Y for a one-unit increase in the given exposure prototype,
while holding the other variables constant”.

Sensitivity analyses
Likelihood ratio (LR) test (single model)
Full model: \[Y = \beta_0 + \beta_1 *
\text{NAP_1M_2M} + \beta_2 * \text{ACE} + \beta_3 * \text{FLU_PHE} +
\beta_4 * \text{PYR} + \beta_5 * \text{CHR} + \epsilon\]
Nested model: \[Y = \beta_0 +
\epsilon\]
\(H_0\): The full model and the
nested model fit the data equally well. Thus, you should use the nested
model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.7166 0.9146
## Hannum EAA 0.8605 0.9146
## PhenoAge EAA 0.0779 0.5412
## Skin&Blood EAA 0.5460 0.9146
## GrimAge EAA 0.2178 0.5808
## DNAmTL 0.1353 0.5412
## IEAA 0.7881 0.9146
## EEAA 0.9146 0.9146
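The likelihood ratio comparison of a full model against an intercept-only nested model can be sketched in Python as follows. This is a minimal illustration, assuming ordinary least-squares fits with Gaussian errors, for which the LR statistic reduces to \(n \ln(RSS_{nested} / RSS_{full})\) and is compared with a chi-square distribution whose degrees of freedom equal the number of extra terms. The fitting routine actually used for the reported p-values is not stated here, so the function names (`chi2_sf`, `fit_rss`, `lr_test`) and the toy data are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    t = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-t)
    else:
        a, q = 0.5, math.erfc(math.sqrt(t))
    # Recurrence Q(a + 1, t) = Q(a, t) + t**a * exp(-t) / Gamma(a + 1).
    while a < df / 2.0:
        q += t ** a * math.exp(-t) / math.gamma(a + 1.0)
        a += 1.0
    return q


def fit_rss(columns, y):
    """Least-squares fit of y on an intercept plus the given columns; returns the RSS."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for k in range(p):
        piv = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[piv] = A[piv], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2 for i in range(n))


def lr_test(full_columns, nested_columns, y):
    """LR statistic, degrees of freedom, and p-value for two nested Gaussian linear models."""
    n = len(y)
    stat = n * math.log(fit_rss(nested_columns, y) / fit_rss(full_columns, y))
    df = len(full_columns) - len(nested_columns)
    return stat, df, chi2_sf(stat, df)


# Synthetic toy data for illustration only (not LEX data).
x1 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
x2 = [1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0]
y = [0.3, 1.8, 4.1, 5.6, 8.2, 10.0, 11.9, 14.4, 15.7, 18.1]

# Full model: intercept + x1 + x2. Nested model: intercept only.
lr_stat, lr_df, lr_p = lr_test([x1, x2], [], y)
print(f"LR statistic = {lr_stat:.2f}, df = {lr_df}, p = {lr_p:.3g}")
```

A small p-value favours the full model (\(H_A\)); otherwise the nested model is retained (\(H_0\)). For the confounder-adjusted comparisons, the same function would take the confounder columns as `nested_columns` and the confounders plus exposures as `full_columns`.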
Likelihood ratio (LR) test (single model) with subjects using only
smoky or smokeless coal
Full model: \[Y = \beta_0 + \beta_1 *
\text{NAP_1M_2M} + \beta_2 * \text{ACE} + \beta_3 * \text{FLU_PHE} +
\beta_4 * \text{PYR} + \beta_5 * \text{CHR} + \epsilon\]
Nested model: \[Y = \beta_0 +
\epsilon\]
\(H_0\): The full model and the
nested model fit the data equally well. Thus, you should use the nested
model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.5630 0.9535
## Hannum EAA 0.7871 0.9535
## PhenoAge EAA 0.1480 0.9535
## Skin&Blood EAA 0.9124 0.9535
## GrimAge EAA 0.8240 0.9535
## DNAmTL 0.5162 0.9535
## IEAA 0.7267 0.9535
## EEAA 0.9535 0.9535
Likelihood ratio (LR) test (single model) with subjects only using
smoky coal
Full model: \[Y = \beta_0 + \beta_1 *
\text{NAP_1M_2M} + \beta_2 * \text{ACE} + \beta_3 * \text{FLU_PHE} +
\beta_4 * \text{PYR} + \beta_5 * \text{CHR} + \epsilon\]
Nested model: \[Y = \beta_0 +
\epsilon\]
\(H_0\): The full model and the
nested model fit the data equally well. Thus, you should use the nested
model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.7266 0.9357
## Hannum EAA 0.8797 0.9357
## PhenoAge EAA 0.1130 0.9040
## Skin&Blood EAA 0.6715 0.9357
## GrimAge EAA 0.8736 0.9357
## DNAmTL 0.6495 0.9357
## IEAA 0.8091 0.9357
## EEAA 0.9357 0.9357
Ambient Exposure
Summary of the exposure estimates:
| Characteristic | Overall, N = 112 | Smokeles, N = 17 | Smoky, N = 87 | Wood_and_or_Plant, N = 8 |
| --- | --- | --- | --- | --- |
| bap_air | 39.44 (18.9, 74.1) | 10.09 (4.5, 20.7) | 45.22 (21.9, 76.7) | 69.11 (57.0, 131.2) |
| (Missing) | 4 | 0 | 3 | 1 |
| pm25_air | 139.32 (100.1, 227.1) | 120.16 (102.4, 160.7) | 137.48 (98.3, 211.0) | 421.89 (252.7, 480.4) |
| ANY_air | 564.51 (305.8, 977.5) | 477.86 (187.2, 791.4) | 560.77 (306.0, 914.7) | 7,030.90 (3,125.6, 10,967.7) |
| (Missing) | 35 | 7 | 24 | 4 |
| BPE_air | 46.55 (19.5, 73.4) | 12.70 (3.9, 19.7) | 48.29 (22.9, 83.4) | 66.81 (42.4, 114.8) |
| (Missing) | 4 | 0 | 3 | 1 |
| BaA_air | 40.51 (16.7, 88.1) | 9.44 (2.9, 23.3) | 50.23 (20.7, 106.2) | 68.31 (61.8, 163.2) |
| (Missing) | 4 | 0 | 3 | 1 |
| BbF_air | 62.69 (32.8, 120.9) | 31.76 (13.5, 50.1) | 65.78 (34.5, 124.7) | 88.69 (78.2, 181.6) |
| (Missing) | 4 | 0 | 3 | 1 |
| BkF_air | 13.24 (6.4, 25.9) | 3.37 (2.0, 7.6) | 15.07 (8.0, 28.6) | 27.64 (12.5, 48.0) |
| (Missing) | 4 | 0 | 3 | 1 |
| CHR_air | 45.82 (16.4, 86.9) | 15.24 (4.9, 31.8) | 50.79 (18.1, 86.9) | 91.89 (61.3, 134.8) |
| (Missing) | 4 | 0 | 3 | 1 |
| DBA_air | 12.49 (4.4, 27.5) | 3.92 (1.4, 11.0) | 14.25 (6.1, 31.8) | 12.67 (7.6, 25.3) |
| (Missing) | 4 | 0 | 3 | 1 |
| FLT_air | 17.33 (5.1, 41.6) | 4.35 (0.6, 7.2) | 19.15 (6.5, 41.8) | 104.71 (48.9, 175.2) |
| (Missing) | 4 | 0 | 3 | 1 |
| FLU_air | 276.10 (165.2, 546.9) | 251.42 (219.0, 298.2) | 276.10 (159.0, 544.6) | 1,426.05 (632.8, 2,241.9) |
| (Missing) | 35 | 7 | 24 | 4 |
| IPY_air | 27.29 (14.0, 47.7) | 12.70 (4.3, 16.6) | 30.70 (15.3, 48.1) | 69.17 (51.1, 118.8) |
| (Missing) | 4 | 0 | 3 | 1 |
| NAP_air | 3,170.67 (1,807.5, 5,568.9) | 3,217.69 (2,288.3, 4,623.5) | 3,142.04 (1,759.1, 5,442.8) | 29,828.64 (11,068.1, 49,775.1) |
| (Missing) | 35 | 7 | 24 | 4 |
| PHE_air | 396.14 (220.9, 820.9) | 363.30 (294.3, 550.4) | 380.03 (206.2, 771.8) | 2,120.65 (907.6, 3,404.2) |
| (Missing) | 35 | 7 | 24 | 4 |
| PYR_air | 21.81 (6.1, 51.3) | 6.42 (0.6, 8.2) | 23.96 (7.7, 51.3) | 108.99 (71.5, 191.4) |
| (Missing) | 4 | 0 | 3 | 1 |
Primary analysis
Linear regression for each ambient exposure measurement (mix
model)
In the following section, we performed linear regression with
equation \[
\begin{aligned}
Y = & \beta_0 + \beta_1 *X\\
& + \beta_2 * county + \beta_3 * BMI + \beta_4 * ses + \beta_5 *
edu + \beta_6 * curStove + \epsilon
\end{aligned}
\] where \(Y\) is one of the
epigenetic age accelerations, and \(X\)
is one of the ambient exposure measurements.
The estimates of \(\beta_1\) for
each \(Y\) and \(X\) are shown below; each can be
interpreted as “the change in the mean of Y for a one-unit increase in X,
while holding the other variables constant”.

Likelihood ratio (LR) test (mix model)
Full model: \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * \text{bap} + \beta_2 * \text{pm25} +
\beta_3 * \text{ANY} + \beta_4 * \text{BPE} + \beta_5 * \text{BaA} \\
& + \beta_6 * \text{BbF} + \beta_7 * \text{BkF} + \beta_8 *
\text{CHR} + \beta_9 * \text{DBA} + \beta_{10} * \text{FLT} \\
& + \beta_{11} * \text{FLU} + \beta_{12} * \text{IPY} + \beta_{13}
* \text{NAP} + \beta_{14} * \text{PHE} + \beta_{15} * \text{PYR} \\
& + \beta_{16} * county + \beta_{17} * BMI + \beta_{18} * ses +
\beta_{19} * edu + \beta_{20} * curStove + \epsilon
\end{aligned}
\] Nested model: \[
\begin{aligned}
Y = & \beta_0 \\
& + \beta_1 * county + \beta_2 * BMI + \beta_3 * ses + \beta_4 *
edu + \beta_5 * curStove + \epsilon
\end{aligned}
\] \(H_0\): The full model and
the nested model fit the data equally well. Thus, you should use the
nested model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.0101 0.0404
## Hannum EAA 0.2310 0.2310
## PhenoAge EAA 0.0616 0.1232
## Skin&Blood EAA 0.1150 0.1621
## GrimAge EAA 0.0011 0.0088
## DNAmTL 0.1860 0.2126
## IEAA 0.0198 0.0528
## EEAA 0.1216 0.1621
Sensitivity analysis
Linear regression for each ambient exposure measurement (simple
model)
In the following section, we performed linear regression with
equation \[Y = \beta_0 + \beta_1 *X +
\epsilon\] where \(Y\) is one of
the epigenetic age accelerations, and \(X\) is one of the ambient exposure
measurements.
The estimates of \(\beta_1\) for
each \(Y\) and \(X\) are shown below; each can be
interpreted as “the change in the mean of Y for a one-unit increase in X”.
Because this model contains no other covariates, the estimates are unadjusted.
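To make the interpretation of \(\beta_1\) in the simple model concrete, the following Python sketch fits \(Y = \beta_0 + \beta_1 X\) by least squares and checks that the predicted mean of \(Y\) at \(X + 1\) differs from that at \(X\) by exactly the slope. The numbers are synthetic and the helper name `simple_ols` is hypothetical; neither comes from the LEX data or analysis code.

```python
def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Synthetic values for illustration only (not LEX data).
exposure = [10.0, 20.0, 30.0, 40.0, 50.0]
eaa = [-1.0, -0.4, 0.1, 0.5, 1.3]

b0, b1 = simple_ols(exposure, eaa)
# Predicted mean EAA at X = 26 minus predicted mean EAA at X = 25.
diff = (b0 + b1 * 26.0) - (b0 + b1 * 25.0)
print(f"slope = {b1:.3f}, change per one-unit increase = {diff:.3f}")
```

Since the ambient measurements sit on very different scales (for example, NAP_air in the thousands versus BkF_air in the tens), per-unit slopes are not directly comparable across exposures.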

Likelihood ratio (LR) test (single model)
Full model: \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * \text{bap} + \beta_2 * \text{pm25} +
\beta_3 * \text{ANY} + \beta_4 * \text{BPE} + \beta_5 * \text{BaA} \\
& + \beta_6 * \text{BbF} + \beta_7 * \text{BkF} + \beta_8 *
\text{CHR} + \beta_9 * \text{DBA} + \beta_{10} * \text{FLT} \\
& + \beta_{11} * \text{FLU} + \beta_{12} * \text{IPY} + \beta_{13}
* \text{NAP} + \beta_{14} * \text{PHE} + \beta_{15} * \text{PYR} \\
& + \epsilon
\end{aligned}
\]
Nested model: \[Y = \beta_0 +
\epsilon\]
\(H_0\): The full model and the
nested model fit the data equally well. Thus, you should use the nested
model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.0124 0.0440
## Hannum EAA 0.2385 0.3023
## PhenoAge EAA 0.0864 0.1728
## Skin&Blood EAA 0.2439 0.3023
## GrimAge EAA 0.0165 0.0440
## DNAmTL 0.5855 0.5855
## IEAA 0.0154 0.0440
## EEAA 0.2645 0.3023
Likelihood ratio (LR) test (single model) with subjects using only
smoky or smokeless coal
Full model: \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * \text{bap} + \beta_2 * \text{pm25} +
\beta_3 * \text{ANY} + \beta_4 * \text{BPE} + \beta_5 * \text{BaA} \\
& + \beta_6 * \text{BbF} + \beta_7 * \text{BkF} + \beta_8 *
\text{CHR} + \beta_9 * \text{DBA} + \beta_{10} * \text{FLT} \\
& + \beta_{11} * \text{FLU} + \beta_{12} * \text{IPY} + \beta_{13}
* \text{NAP} + \beta_{14} * \text{PHE} + \beta_{15} * \text{PYR} \\
& + \epsilon
\end{aligned}
\]
Nested model: \[Y = \beta_0 +
\epsilon\]
\(H_0\): The full model and the
nested model fit the data equally well. Thus, you should use the nested
model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.4481 0.9041
## Hannum EAA 0.6843 0.9041
## PhenoAge EAA 0.6391 0.9041
## Skin&Blood EAA 0.9041 0.9041
## GrimAge EAA 0.1945 0.9041
## DNAmTL 0.8384 0.9041
## IEAA 0.2352 0.9041
## EEAA 0.6906 0.9041
Likelihood ratio (LR) test (single model) with subjects only using
smoky coal
Full model: \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * \text{bap} + \beta_2 * \text{pm25} +
\beta_3 * \text{ANY} + \beta_4 * \text{BPE} + \beta_5 * \text{BaA} \\
& + \beta_6 * \text{BbF} + \beta_7 * \text{BkF} + \beta_8 *
\text{CHR} + \beta_9 * \text{DBA} + \beta_{10} * \text{FLT} \\
& + \beta_{11} * \text{FLU} + \beta_{12} * \text{IPY} + \beta_{13}
* \text{NAP} + \beta_{14} * \text{PHE} + \beta_{15} * \text{PYR} \\
& + \epsilon
\end{aligned}
\]
Nested model: \[Y = \beta_0 +
\epsilon\]
\(H_0\): The full model and the
nested model fit the data equally well. Thus, you should use the nested
model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.7795 0.9656
## Hannum EAA 0.8191 0.9656
## PhenoAge EAA 0.4285 0.9656
## Skin&Blood EAA 0.8324 0.9656
## GrimAge EAA 0.0552 0.4416
## DNAmTL 0.9656 0.9656
## IEAA 0.3845 0.9656
## EEAA 0.8923 0.9656
Urinary Measurements
Summary of the exposure estimates:
| Characteristic | Overall, N = 112 | Smokeles, N = 17 | Smoky, N = 87 | Wood_and_or_Plant, N = 8 |
| --- | --- | --- | --- | --- |
| Benzanthracene_Chrysene_urine | 0.38 (0.3, 0.8) | 0.29 (0.1, 0.6) | 0.45 (0.3, 1.0) | 0.36 (0.3, 0.6) |
| (Missing) | 2 | 0 | 2 | 0 |
| Naphthalene_urine | 107.58 (72.1, 168.8) | 96.94 (54.9, 110.9) | 108.85 (73.5, 169.3) | 141.97 (99.7, 174.6) |
| Methylnaphthalene_2_urine | 26.67 (17.9, 45.0) | 17.92 (8.8, 23.4) | 30.18 (20.9, 46.4) | 20.30 (12.2, 34.2) |
| (Missing) | 7 | 0 | 7 | 0 |
| Methylnaphthalene_1_urine | 10.93 (6.6, 18.1) | 5.26 (3.6, 10.5) | 11.52 (7.7, 20.9) | 15.06 (11.0, 26.7) |
| (Missing) | 4 | 1 | 3 | 0 |
| Acenaphthene_urine | 3.14 (2.2, 7.3) | 2.82 (2.2, 3.5) | 3.38 (2.3, 7.9) | 3.58 (2.0, 7.2) |
| Phenanthrene_Anthracene_urine | 112.78 (42.4, 239.6) | 78.75 (41.6, 135.5) | 115.58 (56.8, 239.7) | 109.86 (39.6, 305.8) |
| Fluoranthene_urine | 16.53 (6.1, 23.1) | 17.68 (5.4, 20.8) | 15.25 (6.3, 23.2) | 23.23 (22.4, 36.0) |
| Pyrene_urine | 0.54 (0.4, 0.8) | 0.41 (0.4, 0.4) | 0.54 (0.4, 0.8) | 0.78 (0.7, 0.9) |
| (Missing) | 15 | 7 | 7 | 1 |
Primary analysis
Linear regression for each urinary exposure measurement (mix
model)
In the following section, we performed linear regression with
equation \[
\begin{aligned}
Y = & \beta_0 + \beta_1 *X\\
& + \beta_2 * county + \beta_3 * BMI + \beta_4 * ses + \beta_5 *
edu + \beta_6 * curStove + \epsilon
\end{aligned}
\] where \(Y\) is one of the
epigenetic age accelerations, and \(X\)
is one of the urinary exposure measurements.
The estimates of \(\beta_1\) for
each \(Y\) and \(X\) are shown below; each can be
interpreted as “the change in the mean of Y for a one-unit increase in X,
while holding the other variables constant”.

Likelihood ratio (LR) test (mix model)
Full model: \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * \text{Benzanthracene_Chrysene} + \beta_2
* \text{Naphthalene} \\
& + \beta_3 * \text{2.Methylnaphthalene} + \beta_4 *
\text{1.Methylnaphthalene} \\
& + \beta_5 * \text{Acenaphthene} + \beta_6 *
\text{Phenanthrene_Anthracene} \\
& + \beta_7 * \text{Fluoranthene} + \beta_8 * \text{Pyrene} \\
& + \beta_9 * county + \beta_{10} * BMI + \beta_{11} * ses +
\beta_{12} * edu + \beta_{13} * curStove + \epsilon
\end{aligned}
\] Nested model: \[
\begin{aligned}
Y = & \beta_0 \\
& + \beta_1 * county + \beta_2 * BMI + \beta_3 * ses + \beta_4 *
edu + \beta_5 * curStove + \epsilon
\end{aligned}
\] \(H_0\): The full model and
the nested model fit the data equally well. Thus, you should use the
nested model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.0001 0.0003
## Hannum EAA 0.0131 0.0150
## PhenoAge EAA 0.0000 0.0000
## Skin&Blood EAA 0.0009 0.0014
## GrimAge EAA 0.0004 0.0008
## DNAmTL 0.0121 0.0150
## IEAA 0.0000 0.0000
## EEAA 0.0289 0.0289
Sensitivity analysis
Linear regression for each urinary exposure measurement (simple
model)
In the following section, we performed linear regression with
equation \[Y = \beta_0 + \beta_1 *X +
\epsilon\] where \(Y\) is one of
the epigenetic age accelerations, and \(X\) is one of the urinary measurements.
The estimates of \(\beta_1\) for
each \(Y\) and \(X\) are shown below; each can be
interpreted as “the change in the mean of Y for a one-unit increase in X”.
Because this model contains no other covariates, the estimates are unadjusted.

Likelihood ratio (LR) test (single model)
Full model: \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * \text{Benzanthracene_Chrysene} + \beta_2
* \text{Naphthalene} \\
& + \beta_3 * \text{2.Methylnaphthalene} + \beta_4 *
\text{1.Methylnaphthalene} \\
& + \beta_5 * \text{Acenaphthene} + \beta_6 *
\text{Phenanthrene_Anthracene} \\
& + \beta_7 * \text{Fluoranthene} + \beta_8 * \text{Pyrene} \\
& + \epsilon
\end{aligned}
\] Nested model: \[Y = \beta_0 +
\epsilon\]
\(H_0\): The full model and the
nested model fit the data equally well. Thus, you should use the nested
model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.0012 0.0048
## Hannum EAA 0.0860 0.0983
## PhenoAge EAA 0.0009 0.0048
## Skin&Blood EAA 0.0075 0.0150
## GrimAge EAA 0.0145 0.0232
## DNAmTL 0.0342 0.0456
## IEAA 0.0029 0.0077
## EEAA 0.1986 0.1986
Likelihood ratio (LR) test (single model) with subjects using only
smoky or smokeless coal
Full model: \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * \text{Benzanthracene_Chrysene} + \beta_2
* \text{Naphthalene} \\
& + \beta_3 * \text{2.Methylnaphthalene} + \beta_4 *
\text{1.Methylnaphthalene} \\
& + \beta_5 * \text{Acenaphthene} + \beta_6 *
\text{Phenanthrene_Anthracene} \\
& + \beta_7 * \text{Fluoranthene} + \beta_8 * \text{Pyrene} \\
& + \epsilon
\end{aligned}
\] Nested model: \[Y = \beta_0 +
\epsilon\]
\(H_0\): The full model and the
nested model fit the data equally well. Thus, you should use the nested
model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.0318 0.0636
## Hannum EAA 0.1305 0.1959
## PhenoAge EAA 0.0049 0.0196
## Skin&Blood EAA 0.0009 0.0072
## GrimAge EAA 0.4144 0.4144
## DNAmTL 0.1663 0.1959
## IEAA 0.0170 0.0453
## EEAA 0.1714 0.1959
Likelihood ratio (LR) test (single model) with subjects only using
smoky coal
Full model: \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * \text{Benzanthracene_Chrysene} + \beta_2
* \text{Naphthalene} \\
& + \beta_3 * \text{2.Methylnaphthalene} + \beta_4 *
\text{1.Methylnaphthalene} \\
& + \beta_5 * \text{Acenaphthene} + \beta_6 *
\text{Phenanthrene_Anthracene} \\
& + \beta_7 * \text{Fluoranthene} + \beta_8 * \text{Pyrene} \\
& + \epsilon
\end{aligned}
\] Nested model: \[Y = \beta_0 +
\epsilon\]
\(H_0\): The full model and the
nested model fit the data equally well. Thus, you should use the nested
model.
\(H_A\): The full model fits the data
significantly better than the nested model. Thus, you should use the
full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.3420 0.4325
## Hannum EAA 0.3784 0.4325
## PhenoAge EAA 0.0197 0.1076
## Skin&Blood EAA 0.1371 0.3656
## GrimAge EAA 0.4957 0.4957
## DNAmTL 0.2194 0.4325
## IEAA 0.0269 0.1076
## EEAA 0.3324 0.4325